Improve quantization flow #15961

ZhennanQin · 2019-08-21T12:18:49Z

Description

@pengzhao-intel @TaoLv @xinyu-intel @reminisce @KellenSunderland @anirudh2290
Major changes:

calib_layer is no longer needed. The list of layers that need calibration is generated by the quantization pass, so users don't have to specify it. Only the layers that need it are calibrated, which improves calibration speed.
Entropy calibration is refactored. Only the histogram of each output is saved, which reduces memory consumption. The entropy calculation is reimplemented as a C++ operator; accuracy is improved, and the entropy method can now reach the same speed as the naive method.

Model	FP32 Accuracy	INT8 Naïve	INT8 entropy_old	INT8 entropy_new
ResNet50-V1	76.340%	76.060%	76.006%	76.053%
Squeezenet 1.0	56.980%	56.790%	55.584%	56.997%
MobileNet 1.0	72.230%	72.060%	71.822%	71.822%
MobileNetV2 1.0	70.270%	69.820%	69.950%	70.016%
Inception V3	77.760%	78.050%	78.019%	78.053%
Inception-BN	72.280%	72.020%	72.084%	71.978%
mean_for_all	70.977%	70.800%	70.578%	70.820%
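
As a rough illustration of why keeping only a histogram saves memory, here is a minimal pure-Python sketch of a running histogram that is updated batch by batch, so the raw outputs never need to be stored. The class name `RunningHistogram`, its fields, and the choice to grow the range by whole bins are assumptions made for this example; this is not the code added by the PR.

```python
import math


class RunningHistogram:
    """Symmetric histogram over [-th, th] that grows as larger values arrive.

    Hypothetical helper for illustration only; not part of MXNet.
    """

    def __init__(self, num_bins=8):
        self.num_bins = num_bins  # number of bins created for the first batch
        self.counts = None        # per-bin counts, allocated on first update
        self.th = None            # current symmetric range limit
        self.width = None         # bin width, fixed after the first batch

    def update(self, values):
        batch_th = max(abs(v) for v in values)
        if self.counts is None:
            # First batch fixes the bin width.
            self.th = batch_th if batch_th > 0 else 1.0
            self.width = 2 * self.th / self.num_bins
            self.counts = [0] * self.num_bins
        elif batch_th > self.th:
            # Extend the range by whole bins on both sides, keeping the bin
            # width, so the existing counts stay valid without re-binning.
            extra = math.ceil((batch_th - self.th) / self.width)
            self.counts = [0] * extra + self.counts + [0] * extra
            self.th += extra * self.width
        for v in values:
            idx = int((v + self.th) / self.width)
            idx = min(idx, len(self.counts) - 1)  # clamp v == th into last bin
            self.counts[idx] += 1


hist = RunningHistogram(num_bins=4)
hist.update([-1.0, 0.5, 1.0])   # first batch: range [-1, 1], 4 bins
hist.update([2.0, -0.25])       # larger value: range grows to [-2, 2], 8 bins
print(hist.th, hist.counts)
```

Memory use depends on the number of bins rather than on the number of calibration samples, which is the property the description relies on.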

Add a new quantization mode, smart, which automatically decides whether each op should be quantized. This mode only quantizes nodes that bring a performance benefit (e.g. convolution and FC) and other necessary nodes. For example, let A be a convolution or FC node, which is always quantized; let B be a Relu or Add node, which is quantizable, so the quantization flow decides whether to quantize it; and let C be a non-quantized node. For A -> B -> A, B is quantized because it can pass int8 data downstream. For C -> B -> C, A -> B -> C, or C -> B -> A, B is not quantized.
Add logging to the quantization flow, which helps users understand what the quantization flow does and what has changed.
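
The A/B/C rule above can be sketched for a simple linear chain of nodes. This is only an illustration of the four cases listed in the description; the function name `smart_quantize_chain` and the restriction to immediate neighbours in a chain are assumptions, and the actual pass operates on a full graph.

```python
ALWAYS = "A"    # e.g. convolution, FC: always quantized
OPTIONAL = "B"  # e.g. Relu, Add: quantizable, decided by context
NEVER = "C"     # non-quantized node


def smart_quantize_chain(kinds):
    """Return one boolean per node of a linear chain: True if quantized.

    Hypothetical sketch: a B node is quantized only when both its
    predecessor and its successor are A nodes (the A -> B -> A case).
    """
    decisions = []
    for i, kind in enumerate(kinds):
        if kind == ALWAYS:
            decisions.append(True)
        elif kind == OPTIONAL:
            prev_is_a = i > 0 and kinds[i - 1] == ALWAYS
            next_is_a = i + 1 < len(kinds) and kinds[i + 1] == ALWAYS
            decisions.append(prev_is_a and next_is_a)
        else:
            decisions.append(False)
    return decisions


for chain in (["A", "B", "A"], ["C", "B", "C"], ["A", "B", "C"], ["C", "B", "A"]):
    print(" -> ".join(chain), smart_quantize_chain(chain))
```

Only the A -> B -> A chain quantizes B; in the other three cases quantizing B would require extra conversions between int8 and FP32 around it.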

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage:
Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
Code is well-documented:
For user-facing API changes, API doc string has been updated.
For new C++ functions in header files, their functionalities and arguments are documented.
For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

Feature1, tests, (and when applicable, API doc)
Feature2, tests, (and when applicable, API doc)

Comments

If this change is a backward incompatible change, why must this change be made.
Interesting edge cases to note here

pengzhao-intel · 2019-08-21T12:49:19Z

@KellenSunderland the requests from your team :)

pengzhao-intel · 2019-08-26T13:08:23Z

@ZhennanQin @xinyu-intel please rebase the code and retrigger the CI

pengzhao-intel

It's not easy to pass the CI.

Merging now for the customer request.

ZhennanQin added 3 commits August 21, 2019 19:35

Add mkldnn implementation for quantized flatten and smart quantize mode

afef734

Change-Id: Id9d7504890852faf4e84fdcd66585d1fd78beeb2

Add calibrate op

10543c9

Change-Id: I4c82f64dbef501d2560f7ffee93991119a66a5ee

Fix merge

5aa8a26

Change-Id: I068df3d4f3309bc9b950a3b869da9407282c8577

ZhennanQin requested review from anirudh2290, eric-haibin-lin and szha as code owners August 21, 2019 12:18

pengzhao-intel added the MKLDNN label Aug 21, 2019

ZhennanQin force-pushed the smart_quantize_fast branch 2 times, most recently from ad36eb0 to 98261b4 Compare August 22, 2019 00:51

Fix lint

b02c1a7

Change-Id: Ia38369d31c33d0f76a671275910729dfce693950

ZhennanQin force-pushed the smart_quantize_fast branch from 98261b4 to b02c1a7 Compare August 22, 2019 01:07

ZhennanQin added 3 commits August 23, 2019 09:20

Run CI

6e84101

Run CI

dcdde81

Update test_quantization.py

6e309eb

xinyu-intel and others added 8 commits August 26, 2019 21:17

rebase

5ac9d53

Run CI

ced11b8

Fix CI

7c697f6

Change-Id: I7479327db5ebc7c57b7bd810a67d2b765c820534

Fix CI

8f2490a

Change-Id: I4273938cb972c12b8f43dbd95c736a7d32df040e

Fix CI

c3a8b94

Change-Id: I80b47bd1d95520a7cd78cacbbc1a85fe0900123d

Fix CI

3a91337

Change-Id: I56542470010e7bc403f62dc8a8991c2fb58d229e

Fix CI

fb4fbba

Change-Id: If8482fe4da2f3d627dd3cbac8795e021a09a441f

Fix GPU

eca88d6

Change-Id: Ic239cbf7aa3d111f2895badd1cac196fce6a1b86

pengzhao-intel approved these changes Aug 29, 2019

pengzhao-intel merged commit 3f7b6ee into apache:master Aug 29, 2019

ZhennanQin deleted the smart_quantize_fast branch September 16, 2019 07:00
